Competition and adaptation in an Internet evolution model 
We model the evolution of the Internet at the Autonomous System level as a process of competition
for users and adaptation of bandwidth capability. We find the exponent of the degree distribution
as a simple function of the growth rates of the number of autonomous systems and the total number
of connections in the Internet, both empirically measurable quantities. This fact sets our model
apart from others in which this exponent depends on parameters that need to be adjusted in a
model-dependent way. Our approach also accounts for a high level of clustering as well as degree-degree
correlations, both with the same hierarchical structure present in the real Internet. Further, it also
highlights the interplay between bandwidth, connectivity and traffic of the network.
A statistical physics approach to Internet modeling will be successful only if its large-scale
properties can be explained and predicted on the basis of the interactions between basic units at
the microscopic level [1]. Dynamical evolution rules acting at the local scale would then determine
the behavior and the emergent structural properties of the whole Internet, which self-organizes
under an absolute lack of centralized control [2, 3]. This approach is at the core of a set of
recent network models focusing on evolution, which recognise growth as one of the key mechanisms
in network formation, along with preferential attachment or other utility rules
[4, 5, 6, 7, 8, 9, 10]. While several of these models succeed in depicting some of the Internet's
features, none of them provides a complete description of the real topology [11, 12, 13]. In this
paper, we present a new growing network model which, from competition and adaptation mechanisms,
reproduces the topological properties observed in the autonomous system level maps of the Internet,
namely: i) a scale-free distribution of the number of connections, or degree, of vertices $k_i$,
characterized by a power law $P(k) \sim k^{-\gamma}$, $2.1 < \gamma < 2.5$, ii) a high clustering
coefficient $c_k$, defined as the ratio between the number of connected neighbors of a node of
degree $k$ and the maximum possible value, averaged over all nodes of degree $k$, and, finally,
iii) disassortative degree-degree correlations, quantified by means of the average nearest
neighbors degree of nodes of degree $k$, $\bar{k}_{nn}(k)$ [12].
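
As an illustration of the two degree-dependent quantities just defined, the following minimal Python sketch computes $c_k$ and $\bar{k}_{nn}(k)$ for a small undirected graph. The function name `degree_profiles` and the adjacency-dictionary input format are assumptions made for this example only; they are not part of the original analysis.

```python
from collections import defaultdict
from itertools import combinations

def degree_profiles(adj):
    """Return (c_k, knn_k) for an undirected simple graph.

    adj maps each node to the set of its neighbours (hypothetical input format).
    c_k[k]   : clustering coefficient averaged over nodes of degree k (k >= 2).
    knn_k[k] : average nearest-neighbour degree averaged over nodes of degree k.
    """
    c_sum, c_cnt = defaultdict(float), defaultdict(int)
    k_sum, k_cnt = defaultdict(float), defaultdict(int)
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k == 0:
            continue
        # Mean degree of the neighbours of this node.
        k_sum[k] += sum(len(adj[n]) for n in nbrs) / k
        k_cnt[k] += 1
        if k >= 2:
            # Connected neighbour pairs over the maximum possible number of pairs.
            links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
            c_sum[k] += links / (k * (k - 1) / 2)
            c_cnt[k] += 1
    c_k = {k: c_sum[k] / c_cnt[k] for k in c_cnt}
    knn_k = {k: k_sum[k] / k_cnt[k] for k in k_cnt}
    return c_k, knn_k

# Toy graph: a triangle (0, 1, 2) with one extra node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_k, knn_k = degree_profiles(toy)
```

In this toy graph the hub (degree 3) has lower clustering and lower-degree neighbours than the degree-2 nodes, which is the qualitative signature of hierarchical, disassortative structure.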

We start our analysis by looking at the growth of the Internet during the last three decades.
We focus on the temporal evolution of the number of hosts present in the Internet [14] as compared
to the number of distinct autonomous systems (ASs) and the total number of connections among them.
We have reanalysed AS maps collected by the Oregon route-views project, which has recorded the
Internet topology at the AS level since November 1997 [13]. Let $W(t)$, $N(t)$ and $E(t)$ be the
total number of hosts (we assume that the number of hosts is equivalent to the number of users),
the number of ASs and the number of edges among ASs at time $t$, respectively. Fig. 1 shows
empirical measurements for these quantities, revealing exponential growths
$W(t) \sim W_0 e^{\alpha t}$, $N(t) \sim N_0 e^{\beta t}$ and $E(t) \sim E_0 e^{\delta t}$,
respectively, with rates $\alpha = 0.036$, $\beta = 0.029$ and $\delta = 0.033$, where
$\alpha > \delta > \beta$. These exponential growths, in turn, determine the scaling relations
with the system size, that is, $W \sim N^{\alpha/\beta}$, $E \sim N^{\delta/\beta}$ and
$\langle k \rangle \sim N^{\delta/\beta - 1}$ [15]. All three rates are, indeed, quite close
to each other. This result poses the question of whether 
these inequalities actually hold or, in contrast, are due to 
statistical fluctuations. A simple argument will convince 
us that the inequalities are, actually, the natural answer. 
There are two mechanisms capable of compensating for an increase in the number of users: the
creation of new ASs and the creation of new connections by old ASs. When both mechanisms take
place simultaneously, the rate of growth of new ASs, $\beta$, must necessarily be smaller than
$\alpha$, whereas the rate of growth of the number of connections, $\delta$, must be greater than
$\beta$. Any other situation would lead to an imbalance between the number of users and the
maximum number of users that the system can manage.
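
The scaling relations with system size follow from eliminating $t$ between the exponential growth laws. The short Python sketch below evaluates the resulting exponents for the fitted rates quoted in the text; the function names and the values of `n0` and `w0` used in the check are illustrative assumptions.

```python
import math

# Fitted growth rates quoted in the text (units: 1/month).
ALPHA = 0.036  # hosts, W(t)
BETA = 0.029   # autonomous systems, N(t)
DELTA = 0.033  # edges, E(t)

def scaling_exponents(alpha, beta, delta):
    """Exponents of W, E and <k> as powers of N, obtained by eliminating t."""
    return {
        "W": alpha / beta,            # W ~ N^(alpha/beta)
        "E": delta / beta,            # E ~ N^(delta/beta)
        "k_mean": delta / beta - 1.0  # <k> ~ E/N ~ N^(delta/beta - 1)
    }

def hosts_from_size(n, n0, w0, alpha, beta):
    """W expressed as a function of N rather than of t (n0, w0 are hypothetical)."""
    return w0 * (n / n0) ** (alpha / beta)

exps = scaling_exponents(ALPHA, BETA, DELTA)
for name, value in exps.items():
    print(f"{name} ~ N^{value:.3f}")
```

Because $\alpha > \delta > \beta$, the exponent of $W$ exceeds that of $E$, and both exceed 1, so hosts and edges grow faster than the number of ASs.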

Our model is defined according to the following rules: (i) At rate $\alpha W(t)$, new users join
the system and choose provider $i$ according to some preference function,
$\Pi_i(\{\omega_j(t)\})$, where $\omega_j(t)$, $j = 1, \dots, N(t)$, is the number of hosts
connected to AS $j$ at time $t$. The function $\Pi_i(\{\omega_j(t)\})$ is normalised so that
$\sum_i \Pi_i(\{\omega_j(t)\}) = 1$ at any time. (ii) At rate $\beta N(t)$, new ASs join the
network with an initial number of users, $\omega_0$, randomly withdrawn from the pool of users
already attached to existing ASs. Therefore, $\omega_0$ can be understood as the minimum number
of users required to keep ASs in business. (iii) At rate $\lambda$, each user changes his provider
and chooses a new one using the same preference function $\Pi_i(\{\omega_j(t)\})$. Finally,
(iv) each node tries to adapt its number of connections to other nodes according to its present
number of users, in an attempt to provide them with adequate access to the Internet. We will
discuss this last point in the second part of the work. With the above ingredients, in the
continuum approximation, the dynamics of single nodes is



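described by the stochastic differential equation

$$\frac{d\omega_i}{dt} = A(\omega_i, t) + [D(\omega_i, t)]^{1/2}\, \xi(t), \qquad (1)$$

where $A(\omega_i, t)$ is a time dependent drift given by

$$A(\omega_i, t) = (\alpha + \lambda) W(t) \Pi_i - \lambda \omega_i - \beta \omega_0 \qquad (2)$$

and the diffusion term by

$$D(\omega_i, t) = (\alpha + \lambda) W(t) \Pi_i + \lambda \omega_i + \beta \omega_0 - 2 \lambda \omega_i \Pi_i. \qquad (3)$$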

Application of the Central Limit Theorem guarantees the convergence of the noise $\xi(t)$ to a
Gaussian white noise in the limit $W(t) \gg 1$. The first term on the right hand side of Eq. (2)
is a creation term accounting for new and old users that choose node $i$ as a provider. The second
term represents those users who decide to change their providers and, finally, the last term
corresponds to the decrease of users due to the introduction of newly created ASs. To proceed
further, we need to specify the preference function $\Pi_i(\{\omega_j(t)\})$. We assume that, as
a result of a competition process, bigger ASs get users more easily than small ones. The simplest
function satisfying this condition corresponds to the linear preference, that is,

$$\Pi_i(\{\omega_j(t)\}) = \frac{\omega_i}{W(t)}, \qquad (4)$$

where $W(t) = \omega_0 N_0 \exp(\alpha t)$. In this case, the stochastic differential equation (1)
reads

$$\frac{d\omega_i}{dt} = \alpha \omega_i - \beta \omega_0 + [(\alpha + 2\lambda)\omega_i + \beta \omega_0]^{1/2}\, \xi(t). \qquad (5)$$



Notice that reallocation of users (i.e. the $\lambda$-term) only increases the diffusive part in
Eq. (5) but has no net effect on the drift term, which is, eventually, the leading term. The
complete solution of this problem requires solving the Fokker-Planck equation corresponding to
Eq. (5) with a reflecting boundary condition at $\omega = \omega_0$ and initial conditions
$p(\omega_i, t_i | \omega_0, t_i) = \delta(\omega_i - \omega_0)$ ($\delta(\cdot)$ stands for the
Dirac delta function). Here $p(\omega_i, t | \omega_0, t_i)$ is the probability that node $i$ has
wealth $\omega_i$ at time $t$ given that it had $\omega_0$ at time $t_i$. The choice of a
reflecting boundary condition at $\omega = \omega_0$ is equivalent to assuming that $\beta$ is the
overall growth rate of the number of nodes, that is, the composition of the birth and death
processes ruling the evolution of the number of nodes.
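
To make the single-node dynamics concrete, here is a minimal Euler-Maruyama sketch of the linear-preference equation, Eq. (5), with a reflecting boundary at $\omega_0$. It is only an illustration under stated assumptions: the time step `dt`, the value of $\lambda$ used in the usage lines, the seed, and the function names are all hypothetical choices, and the `noise=False` switch is added simply to expose the drift-only trajectory.

```python
import math
import random

def simulate_users(alpha, beta, lam, omega0, t_end, dt=0.01, noise=True, seed=0):
    """Euler-Maruyama integration of Eq. (5) for one AS born with omega0 users."""
    rng = random.Random(seed)
    omega = omega0
    for _ in range(int(round(t_end / dt))):
        drift = alpha * omega - beta * omega0
        diffusion = (alpha + 2.0 * lam) * omega + beta * omega0
        # Gaussian white noise increment scaled by sqrt(D * dt).
        kick = rng.gauss(0.0, 1.0) * math.sqrt(diffusion * dt) if noise else 0.0
        omega += drift * dt + kick
        if omega < omega0:
            # Reflecting boundary at omega0.
            omega = 2.0 * omega0 - omega
    return omega

def drift_only_users(alpha, beta, omega0, elapsed):
    """Closed-form solution of d(omega)/dt = alpha*omega - beta*omega0, omega(0) = omega0."""
    tau = beta / alpha
    return tau * omega0 + (1.0 - tau) * omega0 * math.exp(alpha * elapsed)

# Usage with rates of the order of those used in the simulations; lam is a guess.
noisy = simulate_users(0.035, 0.03, 0.1, 5000.0, 50.0)
smooth = simulate_users(0.035, 0.03, 0.1, 5000.0, 50.0, noise=False)
exact = drift_only_users(0.035, 0.03, 5000.0, 50.0)
```

The reflection step is one simple way to impose the boundary condition in a discrete-time scheme; other discretisations are possible.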

Finding the solution to this problem is not an easy task. Fortunately, we can take advantage of
the fact that, when $\alpha > \beta$, the average number of users of each node increases
exponentially and, since $D(\omega_i, t) = O(A(\omega_i, t))$, fluctuations vanish in the long
time limit. Under this zero noise approximation, the number of hosts connected to an AS introduced
at time $t_i$ is

$$\omega_i(t|t_i) = \frac{\beta}{\alpha}\, \omega_0 + \left(1 - \frac{\beta}{\alpha}\right) \omega_0\, e^{\alpha (t - t_i)}. \qquad (6)$$

FIG. 1: Temporal evolution of the number of hosts, autonomous systems and connections among them
from November 1997 to May 2002. Solid lines are the best fit estimates, which give growth rates of
$\alpha = 0.036$, $\beta = 0.029$ and $\delta = 0.033$ (units are month$^{-1}$).

The probability density function of $\omega$ can be calculated in the long time limit as

$$p(\omega, t) = \beta e^{-\beta t} \int_0^t e^{\beta t_i}\, \delta(\omega - \omega_i(t|t_i))\, dt_i, \qquad (7)$$

which leads to

$$p(\omega, t) = \frac{\tau (1 - \tau)^{\tau} \omega_0^{\tau}}{(\omega - \tau \omega_0)^{1 + \tau}}\, \Theta(\omega_c(t) - \omega), \qquad (8)$$

where we have defined $\tau = \beta/\alpha$ and the cut-off is given by
$\omega_c(t) \approx (1 - \tau)\omega_0 e^{\alpha t} \sim W(t)$. Thus, in the long time limit,
$p(\omega, t)$ approaches a stationary distribution with an increasing cut-off. In the case of
the Internet, $\alpha \gtrsim \beta$, which implies an exponent smaller than but close to 2. A
similar result was obtained in p| . 
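
The stationary power law can be checked numerically under the zero noise approximation. In the sketch below, birth times are drawn with density proportional to $e^{\beta t_i}$ (the weighting in the integral of Eq. (7)), each node is assigned its deterministic number of users, and the empirical tail is compared with the tail obtained by integrating a density with exponent $1 + \tau$. The sample size, observation time, threshold and function names are assumptions chosen for illustration.

```python
import math
import random

def sample_users(n_samples, alpha, beta, omega0, t, seed=0):
    """Draw users-per-AS values at time t in the zero noise approximation."""
    rng = random.Random(seed)
    tau = beta / alpha
    growth = math.exp(beta * t) - 1.0
    users = []
    for _ in range(n_samples):
        # Inverse-CDF draw of a birth time with density proportional to exp(beta * t_i).
        t_i = math.log(1.0 + rng.random() * growth) / beta
        users.append(tau * omega0 + (1.0 - tau) * omega0 * math.exp(alpha * (t - t_i)))
    return users

def tail_fraction(users, x):
    """Empirical fraction of ASs with more than x users."""
    return sum(w > x for w in users) / len(users)

def predicted_tail(x, alpha, beta, omega0):
    """Tail of a density decaying as (omega - tau*omega0)^-(1 + tau), far below the cut-off."""
    tau = beta / alpha
    return ((1.0 - tau) * omega0 / (x - tau * omega0)) ** tau

users = sample_users(100000, 0.036, 0.029, 5000.0, 400.0)
empirical = tail_fraction(users, 50000.0)
theory = predicted_tail(50000.0, 0.036, 0.029, 5000.0)
```

With $\tau = \beta/\alpha \approx 0.81$ the density exponent $1 + \tau$ is about 1.8, that is, smaller than but close to 2.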

The key point in what follows is how to relate the num- 
ber of users attached to an AS with its degree. Our ba- 
sic assumption is that vertices are continuously adapting 
their bandwidth to the number of users they have. How- 
ever, once an AS decides to increase its bandwidth it has 
to find a peer who, at the same time, wants to increase its 
bandwidth as well. The reason is that connection costs 
among ASs must be assumed by both peers. This fact 
differs from other growing models in which vertices do 
not ask target vertices if they really want to form those 
connections. Our model is, then, to be thought of as a coupling between a competition process for
resources and the adaptation of vertices to their current situation, with the constraint that
connections are only formed between active nodes. Let $b_i(t|t_i)$ be the total bandwidth of an
AS at time $t$ given that it was introduced at time $t_i$. This quantity can include single
connections with other ASs, i.e. the topological degree $k$, but it also accounts for connections
which have higher capacity. This is equivalent to saying that the network is, in fact, weighted
and $b_i$ is the weighted degree. To simplify the model we consider



that bandwidth is discretized in such a way that single 
connections with high capacity are equivalent to multiple 
connections between the same ASs. Then, when a pair 
of ASs agrees to increase their mutual connectivity the 
connection is newly formed if they were not previously 
connected or, if they were, their mutual bandwidth increases by one unit. Now, we assume that,
at time $t$, each AS adapts its total bandwidth proportionally to its number of users. We can
write



$$b_i(t|t_i) = 1 + a(t)\, (\omega_i(t|t_i) - \omega_0). \qquad (9)$$

Summing Eq. (9) for all nodes we get

$$a(t) = \frac{2B(t) - N(t)}{W(t) - \omega_0 N(t)} \approx \frac{2B(t)}{W(t)}, \qquad (10)$$



where $B(t)$ is the total bandwidth of the network which is, obviously, an upper bound to the
total number of edges of the network. This suggests that $B(t)$ will grow according to
$B(t) = B_0 e^{\delta' t}$, where $\delta' \gtrsim \delta$. Using this assumption, we can express
the individual bandwidth as $b_i(t|t_i) \sim N(t)^{(\delta' - \alpha)/\beta}\, \omega_i(t|t_i)$.
From this equation, the scaling of the maximum bandwidth with the system size reads
$b_c(t) \sim N(t)^{\delta'/\beta}$, that is, faster than $N(t)$. This implies that the network
must necessarily contain multiple connections. Then, we propose that degree and bandwidth are
related, in a statistical sense, through the following scaling relation



$$k_i(t|t_i) \sim [b_i(t|t_i)]^{\mu}, \qquad (11)$$



where the scaling exponent, $\mu < 1$, is obtained by imposing that the maximum degree scales
linearly with $N(t)$ [19]. This sets the scaling exponent to $\mu = \beta/\delta'$. The four
growth rates in the model are not all independent but can be related by exploring the interplay
between bandwidth, connectivity and traffic of the network. As the number of users grows, the
global traffic of the Internet also grows, which means that ASs do not only adapt their bandwidth
to their number of users but also to the global traffic of the network. Therefore, $a(t)$ must be
an increasing function of $t$, which, in turn, implies that $\delta' > \alpha$. Using this
condition and summing Eq. (11) for all vertices, the scaling of the total number of connections is
$E(t) \sim N(t)^{2 - \alpha/\delta'}$, which leads to $\delta' = \alpha\beta/(2\beta - \delta)$.
Combining this relation with Eqs. (8), (9) and (11), the degree distribution reads



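$$P(k) \sim \frac{\tau (1 - \tau)^{\tau} [\omega_0 a(t)]^{\tau}}{\mu}\, \frac{1}{k^{\gamma}}\, \Theta(k_c(t) - k), \qquad (12)$$

where the exponent $\gamma$ takes the value

$$\gamma = 1 + \frac{\beta}{2\beta - \delta}. \qquad (13)$$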



Strikingly, the exponent $\gamma$ has lost any dependence on $\alpha$, becoming a function of the
growth rates of both the number of ASs and the number of connections of the network.

FIG. 2: Cumulative degree distribution ($P_c(k) = \sum_{k' \geq k} P(k')$) for the extended AS
map compared to simulations of the model, $r = 0.8$. Inset: AS's degree as a function of AS's
bandwidth. The solid line stands for the scaling relation Eq. (11) with
$\mu = \beta/\delta' = 0.75$.



Using the empirical values for $\beta$ and $\delta$, the predicted exponent is
$\gamma = 2.2 \pm 0.1$, in excellent agreement with the values reported in the literature
[11, 12, 13].
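
The chain of relations between growth rates can be evaluated directly. The following Python sketch computes $\delta'$, $\mu$ and $\gamma$ from the three measured rates using only the expressions given in the text; the function name `derived_exponents` is an assumption for this example.

```python
def derived_exponents(alpha, beta, delta):
    """Return (delta_prime, mu, gamma) from the measured growth rates.

    delta_prime = alpha * beta / (2 * beta - delta)   bandwidth growth rate
    mu          = beta / delta_prime                  degree-bandwidth exponent
    gamma       = 1 + beta / (2 * beta - delta)       degree distribution exponent
    """
    if 2.0 * beta <= delta:
        raise ValueError("the relations require 2 * beta > delta")
    delta_prime = alpha * beta / (2.0 * beta - delta)
    mu = beta / delta_prime
    gamma = 1.0 + beta / (2.0 * beta - delta)
    return delta_prime, mu, gamma

# Rates fitted to the AS maps (1/month).
delta_prime, mu, gamma = derived_exponents(0.036, 0.029, 0.033)
print(f"delta' = {delta_prime:.4f}, mu = {mu:.3f}, gamma = {gamma:.2f}")
```

For the fitted rates this gives $\gamma \approx 2.16$, within the quoted $2.2 \pm 0.1$. Note that $\gamma$ can equivalently be written as $1 + \delta'/\alpha$, which is how the dependence on $\alpha$ cancels once $\delta'$ is substituted.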

So far, we have been mainly interested in the degree 
distribution of the AS map but not in the specific way 
in which the network is formed. To fill this gap we have 
performed numerical simulations that generate network 
topologies in nice agreement with real measures of the 
Internet that go beyond the degree distribution. We 
consider a realistic geographical deployment of ASs and the physical distance among them to take
into account connection costs [17]. Our algorithm, following the lines of the model, works in
four steps:



1. At iteration $t$, $\Delta W(t) = \omega_0 N_0 (e^{\alpha t} - e^{\alpha (t-1)})$ users join
the network and choose a provider among the existing nodes using the preference rule Eq. (4).

2. $\Delta N(t) = N_0 (e^{\beta t} - e^{\beta (t-1)})$ new ASs are introduced with $\omega_0$
users each, those being randomly withdrawn from already existing ASs. Newly created ASs are
located in a two dimensional plane following a fractal set of dimension $D_f = 1.5$ [17].

3. Each AS evaluates its increase of bandwidth, $\Delta b_i(t|t_i)$, according to Eq. (9).

4. A pair of nodes, $(i, j)$, is chosen with probability proportional to $\Delta b_i(t|t_i)$ and
$\Delta b_j(t|t_j)$ respectively, and, whenever they both need to increase their bandwidth, they
form a connection with probability $D(d_{ij}, \omega_i, \omega_j)$. This function takes into
consideration that, due to connection costs, physical links over long distances are unlikely to
be created by small peers. Once the first connection has been formed, they create a new connection
with probability $r$, whenever they still need to increase their bandwidth. This step is repeated
until all nodes have the desired bandwidth.
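
The four steps above can be sketched in a few lines of Python. This is a deliberately simplified toy, not the simulation code used for the figures: it ignores geography (the connection probability $D$ is taken as 1), shares new and withdrawn users out by their expected values instead of sampling individual users, and caps the number of pairing attempts per iteration. The function name `simulate` and all default arguments are assumptions.

```python
import math
import random

def simulate(alpha=0.035, beta=0.03, delta_p=0.04, omega0=5000.0,
             n0=2, b0=1.0, r=0.8, steps=120, seed=0):
    """Toy four-step loop; returns (users, bandwidths, edge multiplicities)."""
    rng = random.Random(seed)
    users = [omega0] * n0   # omega_i for every AS
    band = [0] * n0         # current weighted degree b_i
    edges = {}              # (i, j) with i < j -> number of parallel connections
    for t in range(1, steps + 1):
        # Step 1: new users, shared out by linear preference (expected values).
        total = sum(users)
        new_users = omega0 * n0 * (math.exp(alpha * t) - math.exp(alpha * (t - 1)))
        users = [w + new_users * w / total for w in users]
        # Step 2: new ASs take omega0 users each, withdrawn pro rata (an assumption).
        n_new = int(n0 * math.exp(beta * t)) - len(users)
        if n_new > 0:
            total = sum(users)
            taken = n_new * omega0
            users = [w - taken * w / total for w in users] + [omega0] * n_new
            band += [0] * n_new
        # Step 3: desired bandwidth from Eqs. (9) and (10), with B(t) = b0 * exp(delta_p * t).
        n, total = len(users), sum(users)
        a = max(2.0 * b0 * math.exp(delta_p * t) - n, 0.0) / (total - omega0 * n)
        need = [max(0, round(1 + a * max(w - omega0, 0.0)) - b)
                for w, b in zip(users, band)]
        # Step 4: pair up nodes that both still need bandwidth.
        attempts = 0
        while sum(x > 0 for x in need) >= 2 and attempts < 10 * sum(need) + 100:
            attempts += 1
            i, j = rng.choices(range(n), weights=need, k=2)
            if i == j or need[i] == 0 or need[j] == 0:
                continue
            key = (min(i, j), max(i, j))
            while need[i] > 0 and need[j] > 0:
                edges[key] = edges.get(key, 0) + 1
                band[i] += 1
                band[j] += 1
                need[i] -= 1
                need[j] -= 1
                if rng.random() >= r:   # further parallel connections with probability r
                    break
    return users, band, edges

users, band, edges = simulate(steps=60, seed=1)
degree = {}
for i, j in edges:
    degree[i] = degree.get(i, 0) + 1
    degree[j] = degree.get(j, 0) + 1
```

Here `band` holds the weighted degree (bandwidth) and `degree` the topological degree, so the two can be compared in the spirit of the scaling relation Eq. (11). Reproducing the networks of about 11000 nodes discussed in the text would require many more iterations and a more efficient sampling scheme.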

It is important to stress the fact that nodes must be chosen with probability proportional to
their increase in bandwidth at each step. The reason is that those nodes that need a high
bandwidth increase will be more active when looking for partners with whom to form connections.
Another important point is the role of the parameter $r$. This parameter takes into account the
balance between the costs of forming connections with new peers and the need for diversification
in the number of partners. The effect of $r$ on the network topology is to tune the average degree
and the clustering coefficient by creating more multiple connections. The exponent of the degree
distribution is unaffected except in the limiting case $r \to 1$. In this situation, big peers
will create a huge amount of multiple connections among them, thus reducing the maximum degree of
the network. Finally, we chose an exponential form for the distance probability function,
$D(d_{ij}, \omega_i, \omega_j) = e^{-d_{ij}/d_c(\omega_i, \omega_j)}$, where
$d_c(\omega_i, \omega_j) = \omega_i \omega_j / (\kappa W(t))$ and $\kappa$ is a cost function of
number of users per unit distance, depending on the maximum distance of the fractal set. All
simulations are performed using $\omega_0 = 5000$, $N_0 = 2$, $B_0 = 1$, $\alpha = 0.035$,
$\beta = 0.03$, and $\delta' = 0.04$, and the final size of the networks is $N \approx 11000$.
Simulations will be compared to the AS+ extended map recorded in May 2001, as reported in [13],
which offers a better picture of the actual map.
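
The distance-dependent connection probability is simple enough to write down directly. In the Python sketch below, the function name and the numerical values in the usage lines (including the value of `kappa`) are hypothetical and chosen only to show the qualitative behaviour.

```python
import math

def connection_probability(d_ij, omega_i, omega_j, total_users, kappa):
    """Exponential distance penalty D = exp(-d_ij / d_c).

    d_c = omega_i * omega_j / (kappa * total_users) is the characteristic distance,
    so pairs of large ASs can afford longer links than pairs of small ones.
    """
    d_c = omega_i * omega_j / (kappa * total_users)
    return math.exp(-d_ij / d_c)

# Illustrative numbers only: two small peers versus two large peers at the same distance.
W = 1.0e7
small_pair = connection_probability(100.0, 5.0e3, 5.0e3, W, kappa=0.01)
large_pair = connection_probability(100.0, 5.0e5, 5.0e5, W, kappa=0.01)
```

For a fixed distance the probability rises steeply with the product of the two user counts, which is the mechanism that suppresses long links between small ASs.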

Fig. 2 shows simulation results for the cumulative degree distribution, in nice agreement with
that measured for the AS+ map. The inset exhibits simulation results of the AS's degree as a
function of the AS's bandwidth, conforming to the scaling relation in Eq. (11). The clustering
coefficient and the average nearest neighbors degree are shown in Fig. 3. Dashed lines result
from the model without distance constraints, whereas squares correspond to the model with
distance constraints. Interestingly, the high level of clustering coming out of the model arises
as a consequence of the pattern followed to attach nodes, so that only those ASs willing to form
new connections will link. As can be observed in the figures, distance constraints introduce a
disassortative component by inhibiting connections between small ASs, so that the hierarchical
structure of the real network is better reproduced.

We conclude by pointing out that this work is a first attempt towards a more realistic and
complete modeling of the Internet, which, for instance, is of utmost importance in the testing of
new communication protocols, which can be very sensitive to topological details. We would like to
stress that the relevance of our model resides in the robustness of a simple statistical physics
approach and, as a result, in the unprecedented completeness of the topological description of
the Internet and the novel insights into the dynamical processes leading to network formation.

FIG. 3: Clustering coefficient, $c_k$ (top) and normalised average nearest neighbors degree,
$\bar{k}_{nn}(k) \langle k \rangle / \langle k^2 \rangle$ (bottom), as functions of the node's
degree for the extended autonomous system map (circles) and for the model with and without
distance constraints (red squares and dashed line, respectively).